Evolving Stochastic Context-Free Grammars from Examples Using a Minimum Description Length Principle

Author

  • Bill Keller
Abstract

This paper describes an evolutionary approach to the problem of inferring stochastic context-free grammars from finite language samples. The approach employs a genetic algorithm, with a fitness function derived from a minimum description length principle. Solutions to the inference problem are evolved by optimizing the parameters of a covering grammar for a given language sample. We provide details of our fitness function for grammars and present the results of a number of experiments in learning grammars for a range of formal languages.
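To make the idea of a description-length fitness concrete, here is a minimal sketch of a generic two-part MDL score for a stochastic grammar: the bits needed to write down the rules plus the bits needed to encode the sample's derivations under the rule probabilities. This is not the paper's fitness function, whose details are not given in the abstract. The encoding of rules (a flat `log2` cost per symbol), the function names, and the toy grammar for a^n b^n are all assumptions made for illustration.

```python
import math

def grammar_bits(rules, symbols):
    """Bits to write down the rules.

    Assumed encoding (hypothetical): every symbol occurrence, on the
    left- or right-hand side, costs log2(number of symbols) bits.
    """
    bits_per_symbol = math.log2(len(symbols))
    return sum((1 + len(rhs)) * bits_per_symbol for _lhs, rhs in rules)

def data_bits(rule_probs, rule_counts):
    """Bits to encode the sample's derivations: -sum count(r) * log2 p(r)."""
    return -sum(
        count * math.log2(rule_probs[rule])
        for rule, count in rule_counts.items()
        if count > 0
    )

def description_length(rule_probs, rule_counts, symbols):
    """Two-part score: grammar cost plus data cost (lower is better)."""
    return grammar_bits(rule_probs.keys(), symbols) + data_bits(rule_probs, rule_counts)

# Toy grammar for a^n b^n (illustrative only): S -> a S b | a b
symbols = {"S", "a", "b"}
rule_probs = {
    ("S", ("a", "S", "b")): 0.5,
    ("S", ("a", "b")): 0.5,
}
# Rule usage when deriving the sample {ab, aabb, aaabbb}:
# S -> a S b is used 0 + 1 + 2 = 3 times, S -> a b once per string.
rule_counts = {
    ("S", ("a", "S", "b")): 3,
    ("S", ("a", "b")): 3,
}

print(description_length(rule_probs, rule_counts, symbols))
```

A genetic algorithm could use the negative of such a score as fitness, so that candidates which are both compact and assign high probability to the sample are preferred. How the paper actually balances these two terms is not stated in the abstract.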


Similar resources

Learning Stochastic Categorial Grammars

Stochastic categorial grammars (SCGs) are introduced as a more appropriate formalism for statistical language learners to estimate than stochastic context free grammars. As a vehicle for demonstrating SCG estimation, we show, in terms of crossing rates and in coverage, that when training material is limited, SCG estimation using the Minimum Description Length Principle is preferable to SCG est...


Learning the Grammar of Human Activity from Video

Stochastic Context-Free Grammars (SCFG) have been shown to be useful for applications beyond natural language analysis, specifically vision-based human activity analysis. Vision-based symbol strings differ from natural language strings, in that a string of symbols produced by video often times contains noise symbols, making grammatical inference very difficult. In order to obtain reliable resul...


An MDL Approach to Learning Activity Grammars

Stochastic Context-Free Grammars (SCFG) have been shown to be useful for vision-based human activity analysis. However, action strings from vision-based systems differ from word strings, in that a string of symbols produced by video contains noise symbols, making grammar learning very difficult. In order to learn the basic structure of human activities, it is necessary to filter out these noise...


Unsupervised induction of stochastic context-free grammars using distributional clustering

An algorithm is presented for learning a phrase-structure grammar from tagged text. It clusters sequences of tags together based on local distributional information, and selects clusters that satisfy a novel mutual information criterion. This criterion is shown to be related to the entropy of a random variable associated with the tree structures, and it is demonstrated that it selects linguisti...


Learning context-free grammars to extract relations from text

In this paper we propose a novel relation extraction method, based on grammatical inference. Following a semisupervised learning approach, the text that connects named entities in an annotated corpus is used to infer a context free grammar. The grammar learning algorithm is able to infer grammars from positive examples only, controlling overgeneralisation through minimum description length. Eva...




Publication date: 2007